Skip to content

[sql-50] session migration duplicate ID fix #1133

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ViktorTigerstrom
Copy link
Contributor

The logic to avoid duplicate session IDs was first added with the session linking linking functionality. That means that there may be old legacy sessions that were added prior to that which may have duplicate IDs. Before this commit, such sessions would cause the kvdb to SQL migration to error, as there is a uniqueness constraint on the legacy session alias in the SQL sessions table.

We want to keep that constraint, so in such rare cases, we update the kvdb to SQL migration logic to drop all but the session with the latest CreatedAt time if for sessions that share the same ID.
That follows the logic in c8b78bd as closely as possible, as that migration ensured that all sessions but the latest one was revoked if multiple sessions shared the same ID.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @ViktorTigerstrom, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue in the KVDB to SQL session migration process where legacy sessions with duplicate IDs could cause migration failures due to a uniqueness constraint in the SQL database. The changes introduce logic to identify and resolve these duplicates by migrating only the session with the most recent creation timestamp, thereby ensuring data integrity and successful migration.

Highlights

  • Duplicate Session ID Resolution: Implemented a mechanism within the KVDB to SQL migration to detect and handle sessions with identical IDs.
  • Data Integrity Preservation: Ensures the uniqueness constraint on session IDs in the SQL database is maintained by dropping older duplicate sessions.
  • Migration Logic Enhancement: Modified the migration process to prioritize and retain the session with the latest CreatedAt timestamp when duplicates are encountered.
  • Code Refactoring: Extracted session filtering logic into a dedicated filterSessions function for improved modularity and readability.
  • Comprehensive Testing: Added new test cases, including a specialized helper function, to thoroughly validate the duplicate session ID handling during migration.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a potential issue in the kvdb to SQL session migration where legacy sessions could have duplicate IDs, causing a uniqueness constraint violation. The proposed solution is to detect such duplicates and keep only the session with the most recent CreatedAt timestamp, which is a sensible approach. The implementation is well-contained within a new filterSessions function, and a comprehensive test case has been added to cover this scenario. The refactoring of existing tests to support this change is also well-executed. I have a couple of suggestions to further improve the code. One is to refactor the filterSessions function for better clarity and maintainability. The other is a minor simplification in a new test helper function. Overall, this is a solid contribution that improves the robustness of the migration process.

In the upcoming commit, we will update the kvdb to sql migration to not
always include all sessions in the kvdb store. Therefore, we update the
kvdb to migration tests in order to be able to handle such cases by
including the expected results for each test case.
@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-08-session-migration-duplicate-id-fix branch 2 times, most recently from 0ea7e9f to 566b76a Compare August 20, 2025 11:52
@ViktorTigerstrom ViktorTigerstrom added the no-changelog This PR is does not require a release notes entry label Aug 20, 2025
@ViktorTigerstrom ViktorTigerstrom self-assigned this Aug 20, 2025
Copy link
Contributor

@bitromortac bitromortac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ⚡

Comment on lines 101 to 116
// Sort sessions by creation time in descending order to
// find the latest one.
sort.Slice(sessions, func(i, j int) bool {
return sessions[i].CreatedAt.After(
sessions[j].CreatedAt,
)
})
sessionToKeep = sessions[0]

// Log the sessions that will be dropped.
for _, s := range sessions[1:] {
log.Warnf("Dropping duplicate session with ID "+
"%x created at %v", id, s.CreatedAt)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nitty (I think the current approach is fine as well): we could just determine the max timestamp, it also has the benefit of being stable for testing if two sessions are present with the same timestamp, which we don't have currently

    // Find the session with the latest timestamp in a single pass.
    latestSession := sessions[0]
    for _, s := range sessions[1:] {
        if s.CreatedAt.After(latestSession.CreatedAt) {
            latestSession = s
        }
    }
    sessionToKeep = latestSession

    // Log the sessions that will be dropped.
    for _, s := range sessions {
        if s == sessionToKeep {
            continue
        }
        log.Warnf("Dropping duplicate session with ID %x created at %v",
            id, s.CreatedAt)
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could also traverse the sessions in reverse order (slices.Reverse) (assuming the kv store order is in creation time order), to only use the latest created if two timestamps are the same. I know this is totally unlikely :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Updated to the above approach :)!

we could also traverse the sessions in reverse order

Unfortunately that'd require that we did sort the entire kvsessions list, which would remove any efficiency gained from that. If we did sort it and then traversed in reverse order, we could do everything in just one pass as we wouldn't need to iterate over sessionsByID map since we'd know that if we hit a session with duplicate id, it wouldn't be the latest session with that ID. But since sorting the entire kvSessions list + 1 extra iteration would be slower than just iterating over it twice, I think that wouldn't make sense :)

@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-08-session-migration-duplicate-id-fix branch 2 times, most recently from d61a108 to d20c250 Compare August 21, 2025 14:00
The logic to avoid duplicate session IDs was first added with the
session linking linking functionality. That means that there may be old
legacy sessions that were added prior to that which may have duplicate
IDs. Before this commit, such sessions would cause the kvdb to SQL
migration to error, as there is a uniqueness constraint on the legacy
session alias in the SQL sessions table.

We want to keep that constraint, so in such rare cases, we update the
kvdb to SQL migration logic to drop all but the session with the
latest CreatedAt time if for sessions that share the same ID.
That follows the logic in c8b78bd as
closely as possible, as that migration ensured that all sessions but the
latest one was revoked if multiple sessions shared the same ID.
@ViktorTigerstrom ViktorTigerstrom force-pushed the 2025-08-session-migration-duplicate-id-fix branch from d20c250 to dfe0fa8 Compare August 21, 2025 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog This PR is does not require a release notes entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants